Cross-domain Voice Activity Detection with Self-Supervised Representations
Voice Activity Detection (VAD) aims at detecting speech segments in an audio
signal, which is a necessary first step for many of today's speech-based
applications. Current state-of-the-art methods focus on training a neural
network exploiting features directly contained in the acoustics, such as Mel
Filter Banks (MFBs). Such methods therefore require an extra normalisation step
to adapt to a new domain where the acoustics are affected, which can be simply
due to a change of speaker, microphone, or environment. In addition, this
normalisation step is usually a rather rudimentary method that has certain
limitations, such as being highly susceptible to the amount of data available
for the new domain. Here, we exploited the crowd-sourced Common Voice (CV)
corpus to show that representations based on Self-Supervised Learning (SSL) can
adapt well to different domains, because they are computed with contextualised
representations of speech across multiple domains. SSL representations also
achieve better results than systems based on hand-crafted representations
(MFBs), and off-the-shelf VADs, with significant improvement in cross-domain
settings.
Textual properties and task based evaluation : investigating the role of surface properties, structure and content
This paper investigates the relationship between the results of an extrinsic, task-based evaluation of an NLG system and various metrics measuring both surface and deep semantic textual properties, including relevance. The latter rely heavily on domain knowledge. We show that they correlate systematically with some measures of performance. The core argument of this paper is that more domain knowledge-based metrics shed more light on the relationship between deep semantic properties of a text and task performance.
If it may have happened before, it happened, but not necessarily before
Temporal uncertainty in raw data can impede
the inference of temporal and causal relationships
between events and compromise the output
of data-to-text NLG systems. In this paper,
we introduce a framework to reason with and
represent temporal uncertainty from the raw
data to the generated text, in order to provide a
faithful picture to the user of a particular situation.
The model is grounded in experimental
data from multiple languages, shedding light
on the generality of the approach.
The Importance of Narrative and Other Lessons from an Evaluation of an NLG System that Summarises Clinical Data
This research was funded by the UK Engineering and Physical Sciences Research Council, under grant EP/D049520/1.
Towards a possibility-theoretic approach to uncertainty in medical data interpretation for text generation
Many real-world applications that reason about events obtained from raw data must deal with the problem of temporal uncertainty, which arises due to error or inaccuracy in data. Uncertainty also compromises reasoning where relationships between events need to be inferred. This paper discusses an approach to dealing with uncertainty in temporal and causal relations using Possibility Theory, focusing on a family of medical decision support systems that aim to generate textual summaries from raw patient data in a Neonatal Intensive Care Unit. We describe a framework to capture temporal uncertainty and to express it in generated texts by means of linguistic modifiers. These modifiers have been chosen based on a human experiment testing the association between subjective certainty about a proposition and the participants’ way of verbalising it.
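As a rough illustration of the idea in the abstract above, the sketch below derives a necessity degree from possibility degrees and picks a hedging modifier. It is not the authors' framework: the function names, the threshold logic, and the modifier wording are all hypothetical. Only the relation N(A) = 1 - Π(not A) is standard Possibility Theory.

```python
def necessity(poss_of_negation: float) -> float:
    """Standard possibility-theoretic necessity: N(A) = 1 - Pi(not A)."""
    return 1.0 - poss_of_negation


def choose_modifier(poss: float, poss_of_negation: float) -> str:
    """Pick a hedging modifier for a proposition A.

    poss             -- Pi(A), possibility that A holds
    poss_of_negation -- Pi(not A), possibility that A does not hold

    The cut-offs and wordings below are hypothetical placeholders,
    not the modifiers selected in the paper's human experiment.
    """
    for value in (poss, poss_of_negation):
        if not 0.0 <= value <= 1.0:
            raise ValueError("possibility degrees must lie in [0, 1]")

    nec = necessity(poss_of_negation)
    if nec >= 1.0:
        return ""            # A is certain: no hedge needed
    if nec > 0.0:
        return "probably"    # A is somewhat certain
    if poss >= 1.0:
        return "possibly"    # A is fully possible but not at all certain
    return "unlikely"        # A is not even fully possible


if __name__ == "__main__":
    # Hypothetical degrees for "event X happened before event Y".
    print(repr(choose_modifier(1.0, 0.3)))
```

A text generator could then attach the returned modifier to the sentence that reports the temporal relation, leaving fully certain statements unhedged.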
Text content and task performance in the evaluation of a natural language generation system
An important question in the evaluation of Natural Language Generation systems concerns the relationship between textual characteristics and task performance. If the results of task-based evaluation can be correlated with properties of the text, there are better prospects for improving the system. The present paper investigates this relationship by focusing on the outcomes of a task-based evaluation of a system that generates summaries of patient data, attempting to correlate these with the results of an analysis of the system’s texts, compared to a set of gold standard human-authored summaries.
The importance of narrative and other lessons from an evaluation of an NLG system that summarises clinical data
The BABYTALK BT-45 system generates textual summaries of clinical data about babies in a neonatal intensive care unit. A recent task-based evaluation of the system suggested that these summaries are useful, but not as effective as they could be. In this paper we present a qualitative analysis of problems that the evaluation highlighted in BT-45 texts. Many of these problems are due to the fact that BT-45 does not generate good narrative texts; this is a topic which has not previously received much attention from the NLG research community, but seems to be quite important for creating good data-to-text systems.
The role of graduality for referring expression generation in visual scenes
Referring Expression Generation (REG) algorithms, a core component of systems that generate text from non-linguistic data, seek to identify domain objects using natural language descriptions. While REG has often been applied to visual domains, very few approaches deal with the problem of fuzziness and gradation. This paper discusses these problems and how they can be accommodated to achieve a more realistic view of the task of referring to objects in visual scenes.